\input macro[1,mps]
\magnification\magstephalf
\baselineskip 14pt
\leftline{\sevenrm A82.Tex[358,mps]\today}
\leftline{\copyright\sevenrm Robert W. Floyd, November 30, 1988}
\leftline{\sevenrm Unpublished draft}
\def\fnc#1{\mathop{\rm #1}\nolimits}
\def\pg{\buildrel +\over <}
\def\tl{\buildrel \times\over <}
\def\peq{\buildrel +\over =}
\def\teq{\buildrel \times\over =}
\def\gd{\mathrel{\hbox{$\cdot$\kern-4pt$>$}}}
\rm
\bigskip
\noindent{\bf Chaitin's Theory}
\bigskip
Gregory Chaitin, a self-taught mathematical computer scientist, has developed
a theory of the computational complexity of finite objects such as strings and
integers. Because of careful choices of definitions, this theory is simpler
and more powerful than its predecessors due to Kolmogorov and Solomonoff. Chaitin's
theory provides a satisfactory definition of the difficult concept of
randomness. It connects computability theory to theories of probability
and information.
A {\sl Chaitin machine\/} can be thought of as a deterministic Turing Machine
or equivalent, plus a read-only left-to-right input tape and a
write-only left-to-right output tape. If Chaitin
machine~$M$ reads string~$s$ completely and prints string~$t$, we say
$\phi↓M(s)=t$. The set of pairs $\langle s,\phi↓M(s)\rangle$ is recursively
enumerable, and the set of such $s$ has the {\sl prefix property\/}: no $s$ is a proper prefix of another.
On the other hand, any r.e.\ set of ordered pairs $\{\langle s↓i,t↓i\rangle\}$,
where $s↓i$ is not a prefix of $s↓j$ for any $i≠j$, is the input/output
relation of a Chaitin machine.
[Drill: Prove it.]
We ordinarily assume a two-character
alphabet $\{0,1\}$. We also assume Church's Thesis: for any intuitively
computable partial function with the prefix property, we can construct
a Chaitin machine.
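To make the prefix property concrete, here is a minimal Python sketch (the function name {\tt has\_prefix\_property} is ours, not from the theory) that checks whether a finite set of binary strings could be the domain of a Chaitin machine:

```python
def has_prefix_property(strings):
    """Check that no string in the set is a proper prefix of another.

    In lexicographic order, any prefix of a string immediately precedes
    the strings it begins, so checking adjacent sorted pairs suffices.
    """
    ss = sorted(strings)
    return all(not b.startswith(a) for a, b in zip(ss, ss[1:]))

# A prefix code, and a violating set:
assert has_prefix_property({"0", "10", "110", "111"})
assert not has_prefix_property({"0", "01"})
```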
Rather than having separate theories of strings and numbers, we identify
them as shown below:
$$\vcenter{\halign{\ctr{#}\qquad&\ctr{#}\cr
Number&String\cr
\noalign{\smallskip}
1&$\Lambda$\cr
2&0\cr
3&1\cr
4&00\cr
5&01\cr
6&10\cr
7&11\cr
8&000\cr
etc.\cr}}$$
A number $n$ corresponds to the $n$th string in lexicographic order. A string
corresponds to a non-zero
number by prefixing a 1~and assuming binary notation.
An obvious
mapping provides a number system with a zero, or even negative and
rational numbers, if needed. The length of~$s$, interpreted as a string, is
the same as $\lfloor\log↓2(s)\rfloor$ when $s$ is
interpreted as a number, so we use $\lg(s)$ unambiguously for both
the length of a string and the discrete logarithm of a number.
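The identification of numbers and strings can be sketched in a few lines of Python (function names are ours); the final assertion checks that string length agrees with the discrete logarithm $\lfloor\log↓2 n\rfloor$:

```python
def num_to_string(n):
    """The n-th binary string in length-then-lexicographic order (n >= 1)."""
    return bin(n)[3:]          # drop '0b1': 1 -> '', 2 -> '0', 3 -> '1', 4 -> '00', ...

def string_to_num(s):
    """Inverse map: prefix a 1 and read the result in binary notation."""
    return int("1" + s, 2)

def lg(n):
    """Discrete logarithm floor(log2 n), the length of the n-th string."""
    return n.bit_length() - 1

assert num_to_string(5) == "01" and string_to_num("01") == 5
assert all(lg(n) == len(num_to_string(n)) for n in range(1, 100))
```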
A {\sl prefix code\/} is a finite or infinite set of strings with the prefix
property. The {\sl Kraft inequality\/} asserts that the lengths of the strings
in a prefix code satisfy $\sum↓i2↑{-\lg(s↓i)}≤1$.
{\sl Proof\/}: Identify each string $s$ with the interval of binary fractions
whose most significant bits are~$s$. The measure of the interval is
$2↑{-\lg(s)}$. The intervals are disjoint, so the sum of the measures is
at most one. Obviously, the domain of a Chaitin machine is a prefix code,
so the Kraft inequality applies.
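For a finite prefix code, the Kraft inequality can be checked directly; a Python sketch using exact arithmetic:

```python
from fractions import Fraction

def kraft_sum(code):
    """Sum of 2^(-length) over a set of strings."""
    return sum(Fraction(1, 2 ** len(s)) for s in code)

# A complete prefix code attains the bound; a proper subset stays below it.
assert kraft_sum({"0", "10", "110", "111"}) == 1
assert kraft_sum({"0", "10", "110"}) < 1
```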
\smallskip
For any Chaitin machine $M$, let $\mu↓M(t)$ be the minimum string $s$
such that $\phi↓M(s)\downarrow$ (that is, $\phi↓M(s)$ is defined) and
$\phi↓M(s)=t$. Since the range of $\mu↓M$ is a prefix code, it satisfies
the Kraft inequality. We define the {\sl complexity\/} of string~$t$ for
machine~$M$ by $C↓M(t)=\lg\bigl(\mu↓M(t)\bigr)$.
A Chaitin machine $U$ is {\sl strongly universal\/} if for every Chaitin
machine~$M$ there is a string $P↓{M,U}$, called a {\sl program\/} for $M$
on~$U$, such that after reading $P↓{M,U}$, $U$~behaves like
$M$: $\phi↓U(P↓{M,U}\otimes s)=\phi↓M(s)$. Here $\otimes$ is string
concatenation, and the equality is assumed to hold for a given $s$
if both sides are undefined. A machine~$U$ is {\sl universal\/} if
$$(\forall M)(\exists K)(\forall s)\bigl(C↓U(s)≤C↓M(s)+K\bigr)\,.$$
Obviously, a strongly universal machine is universal. Where possible,
we will not assume strong universality. By Church's thesis, we may assume
we have some (strongly) universal machine~$U$.
Define the relation $f\pg g$ for functions $f$ and $g$, by
$$(f\pg g)≡(\exists K)(\forall s)\bigl(f(s)<g(s)+K\bigr)\,;$$
the relation symbol can be read ``is less than a constant plus''. The
relation is a preorder (reflexive and transitive), inducing an
equivalence relation
$$(f\peq g)≡\bigl((f\pg g)∧(g\pg f)\bigr)$$
which can be read ``is within an additive constant of''.
By the definition of universality, if $U$ is universal, $C↓U\pg C↓M$ for
all~$M$; if $U'$ is also universal, $C↓U\peq C↓{U'}$. We will normally
try to prove bounds on complexities only to within additive constants,
so we may assume a fixed standard universal machine~$U$. We will allow
omission of $U$ as a subscript; $C(s)$, the complexity of~$s$, is
$C↓U(s)$, and $\mu(s)$, the least program for~$s$, is $\mu↓U(s)$.
Also, $P↓M=P↓{M,U}$.
We define a sequence of successively more efficient prefix codes for the
set of all strings:
$$\eqalign{E↓0(s)&=0↑s1\cr
E↓{i+1}(s)&=E↓i\bigl(\lg(s)\bigr)s\,.\cr}$$
For each $i$, there is a Chaitin machine $M↓i$ with
$\phi↓{M↓i}\bigl(E↓i(s)\bigr)=s$;
$C↓{M↓i}(s)=1+\lg s+\lg↑2s+\cdots +\lg↑{i-1}s+2\lg↑is$, showing that
for each~$i≥1${}
$$C(s)\pg \lg s+\lg↑2s+\cdots +\lg↑{i-1}s+2\lg↑is\,.$$
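The codes $E↓i$ can be sketched in Python (helper names are ours); a small test confirms the stated length $1+2\lg s$ for $E↓1$ and the prefix property:

```python
def lg(n):
    """floor(log2 n): the length of the string numbered n."""
    return n.bit_length() - 1

def num_to_string(n):
    return bin(n)[3:]            # the n-th binary string; 1 -> ''

def E(i, n):
    """E_0(n) = 0^n 1; E_{i+1}(n) = E_i(lg n) followed by the string form of n."""
    if i == 0:
        return "0" * n + "1"
    return E(i - 1, lg(n)) + num_to_string(n)

assert E(0, 5) == "000001"
assert E(1, 5) == E(0, 2) + "01"          # length 2*lg(5) + 1 = 5
codes = sorted(E(1, n) for n in range(1, 130))
assert all(not b.startswith(a) for a, b in zip(codes, codes[1:]))
```

A decoder for $E↓1$ reads zeros up to the first 1, giving a count $k$, then reads $k$ more bits, which is why the code is prefix-free.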
\smallskip
\noindent{\sl Lemma\/}: For each $i≥1$, the sequence of numbers
$$x↓s=\lg s+\lg↑2s+\cdots +\lg↑is$$
violates the Kraft inequality: $\sum↓s2↑{-x↓s}=∞$.
\smallskip
Proof is by induction. For $i=1$,
$$\sum↓{s=1}↑∞2↑{-\lg s}=\sum↓{n=0}↑∞\,\sum↓{s=2↑n}↑{2↑{n+1}-1}2↑{-n}=
\sum↓{n=0}↑∞2↑n\cdot 2↑{-n}=∞\,.$$
The induction step is
$$\eqalign{\sum↓{s=1}↑∞&\,2↑{-(\lg s+\lg↑2s+\cdots +\lg↑{i+1}s)}\cr
\noalign{\smallskip}
&\qquad =\sum↓{n=1}↑∞\,\sum↓{s=2↑n}↑{2↑{n+1}-1}2↑{-n}\cdot
2↑{-(\lg n+\lg↑2n+\cdots +\lg↑in)}\cr
\noalign{\smallskip}
&\qquad =\sum↓{n=1}↑∞2↑{-(\lg n+\lg↑2n+\cdots +\lg↑in)}=∞\,.\cr}$$
Because of this lemma, we cannot have $C(s)\pg \lg s+\lg↑2s+\cdots +\lg↑ns$
for any~$n$. Naturally, there may be infinitely many individual values
of~$s$ of complexity much less than this formula; for example,
$s=2↑{2↑k}$ has complexity $C(s)\pg k=\lg\lg s$.
If $t=f(t↓1,t↓2)$ for a recursive function~$f$, $C\bigl(f(t↓1,t↓2)\bigr)
\pg C(t↓1)+C(t↓2)$. A program for $t$ achieving this bound is
$P↓M\otimes\mu(t↓1)\otimes\mu(t↓2)$, where machine~$M$ executes this
algorithm:
\smallskip
\display 30pt:(1):
Simulate $U$, reading $\mu(t↓1)$, saving its result as $t↓1$.
\smallskip
\display 30pt:(2):
Simulate $U$, reading $\mu(t↓2)$, saving its result as $t↓2$.
\smallskip
\display 30pt:(3):
Compute and print $f(t↓1,t↓2)$.
\smallskip\noindent
An important consequence is the subadditivity of concatenation:
$$C(t↓1\otimes t↓2)\pg C(t↓1)+C(t↓2)\,.$$
Define the {\sl probability\/} Pr$↓M(t)$ of output $t$ for machine~$M$
as the sum of $2↑{-\lg s}$ over the set of~$s$ for which $\phi↓M(s)=t$.
By the Kraft inequality, $\sum↓t\hbox{Pr}↓M(t)≤1$. Clearly,
$\hbox{Pr}↓M(t)≥2↑{-C↓M(t)}$, so $C↓M(t)≥-\log↓2\hbox{Pr}↓M(t)$.
Surprisingly, the converse also holds to within an additive constant:
$$C(t)\pg -\log↓2\hbox{Pr}(t)\,.$$
(Proof omitted here.)
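For a toy machine given by a finite input/output table, $\hbox{Pr}↓M$ and the bound $\hbox{Pr}↓M(t)≥2↑{-C↓M(t)}$ can be checked directly; a Python sketch (the table {\tt phi} is invented for illustration):

```python
from fractions import Fraction

# A toy Chaitin machine as a finite table with prefix-free domain.
phi = {"0": "1", "10": "1", "110": "0"}

def prob(t):
    """Pr_M(t): total measure 2^(-|s|) of inputs s mapped to t."""
    return sum(Fraction(1, 2 ** len(s)) for s, out in phi.items() if out == t)

def complexity(t):
    """C_M(t): length of the shortest input producing t."""
    return min(len(s) for s, out in phi.items() if out == t)

assert prob("1") == Fraction(3, 4)
assert prob("1") >= Fraction(1, 2 ** complexity("1"))   # Pr_M(t) >= 2^(-C_M(t))
```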
By analogy to $\pg$ and $\peq$, define relations which order functions
to within a multiplicative constant by
$$\eqalign{(f\tl g)&≡(\exists K>0)(\forall s)\bigl(f(s)<Kg(s)\bigr)\,,\cr
(f\teq g)&≡(f\tl g)∧(g\tl f)\,.\cr}$$
By exponentiating a $\pg$ relation, we get a $\teq$ relation;
conversely, taking logarithms in a $\teq$ relation, we get a $\pg$
relation.
\smallskip\noindent
{\sl Theorem\/}:
$$\sum↓{\lg(t)=n}\hbox{Pr}(t)\teq\hbox{Pr}(n)\,.$$
Proof: Define a machine $M$ which on input $s$ computes $\lg\bigl(\phi↓U(s)\bigr)$.
Then
$$\sum↓{\lg(t)=n}\hbox{Pr}(t)=\hbox{Pr}↓M(n)\tl {\rm Pr}(n)\,.$$
On the other hand, define $M'$ which on input $\mu(\lg t)t$ prints~$t$.
Then $\hbox{Pr}↓{M'}(t)≥2↑{-\lg\mu(\lg t)}\cdot 2↑{-\lg t}$, so
$2↑{-\lg\mu(n)}\cdot 2↑{-n}\tl \hbox{Pr}(t)$, where $n=\lg t$.
Notice that, by the bound $C(n)\pg -\log↓2\hbox{Pr}(n)$,
$\hbox{Pr}(n)\tl 2↑{-\lg\bigl(\mu(n)\bigr)}=2↑{-C(n)}$.
Summing over the $2↑n$ values of $t$ of length~$n$,
$$\hbox{Pr}(n)=\hbox{Pr}(n)\cdot 2↑{-n}\cdot
2↑n\tl \sum↓{\lg t=n}\hbox{Pr}(t)\,.$$
Exercise: Show that for each $n$ the fraction of strings of length~$n$
with complexity less than $C(n)+n-k$ is $\tl 2↑{-k}$.
On the other hand, $C(t)\pg C(\lg t)+\lg t$, so we can say that
for ``most'' strings $C(t)\peq C\bigl(\lg(t)\bigr)+\lg t$.
Show as a consequence that almost all (a set of measure~1)
infinite strings have prefixes
with complexity within a constant bound of the ``typical'' complexity
$\lg t+C(\lg t)$.
\medskip
\noindent(***** RWF: redo next paragraph *****)
\smallskip
By using the recursive function $\lg$, we find
$$\hbox{Pr}(n)\teq \sum↓{\lg(s)=n}\hbox{Pr}(s)\,.$$
This allows a deduction that for ``most''~$s$, $C(s)\peq \lg(s)+
C\bigl(\lg(s)\bigr)$. What is the probability that for some
prefix~$s↓i$ of a random infinite string,
$$C(s↓i)<\lg(s↓i)-K\,?$$
(Show it is $<2↑{-K}$.) So most infinite strings have no non-random prefixes.
Define an infinite string as random if $C(s↓i)\gd\lg(s↓i)$ for all its
prefixes~$s↓i$. Define a finite
string~$s$ as random$↓K$ if $C(s)>\lg(s)+C\bigl(\lg(s)\bigr)-K$.
The probability that string~$s$ of a given length is not random$↓K$ is at
most~$2↑{-K}$.
More generally, take any recursive measure of string probability. Then
the complexity will be less than $-\lg({\rm probability})-K$
for a set of measure at most~$2↑{-K}$.
If $r$ is a partial recursive function (prf) of one argument, define
its minimal program $\mu(r)=\bigl(\min v\mid\forall s\,U(vs)=r\bigl(
U(s)\bigr)\bigr)$, and its complexity $C(r)=\lg\bigl(\mu(r)\bigr)$.
Then $C(r↓1\circ r↓2)≤C(r↓1)+C(r↓2)$.
Proof: Let $v↓1=\mu(r↓1)$, $v↓2=\mu(r↓2)$;
$U(v↓1v↓2s)=r↓1\bigl(U(v↓2s)\bigr)=r↓1\bigl(r↓2\bigl(U(s)\bigr)\bigr)
=[r↓1\circ r↓2]\bigl(U(s)\bigr)$, so
$\mu(r↓1\circ r↓2)≤v↓1\otimes v↓2$. Also, $C\bigl(r(s)\bigr)≤
C(r)+C(s)$ by an analogous proof.
An interesting special case: if $r$ is the identity function,
$\mu(r)=\Lambda$, $C(r)=0$. This shows that the minimal programs
for prfs do not satisfy the Kraft inequality.
A {\sl complexity pair\/} is a pair $\langle s,n\rangle$ such that
$n=C(s)$. Let $S$ be any r.e.\ set of complexity pairs. Define
a machine~$M$ which enumerates the members of $S$ until finding
a pair $\langle s,n\rangle$ such that $n>\lg(P↓M)$; it then
prints $s$ and stops. If it stops, we have the contradiction
$$C(s)≤\lg(P↓M)<n=C(s)\,.$$
Thus it never stops, and no pair $\langle s,n\rangle\in S$
has $n>\lg(P↓M)$. There are only finitely many strings of complexity
$≤\lg(P↓M)$, so any r.e.\ set of complexity pairs is finite. In
effect, this means that no theory of complexity pairs can do better
than providing a finite list.
\smallskip
Any infinite string $v$ has infinitely many prefixes of length $L$
with complexity no more than $L+O(\lg\lg L)$.
\smallskip
Proof. Pick an integer $n$. Let $s$ be the first $n$ characters of~$v$,
and $t$ be the next $s$ characters of~$v$ (interpreting $s$ as a number).
Build a machine~$M$ which,
on input $\mu(s)\otimes t$, will simulate $U$ giving~$s$, and then read
the next $s$~characters as~$t$, printing $s$ and~$t$. Then
$\mu(st)≤P↓M\otimes \mu(s)\otimes t$, and $C(st)\pg (\lg s+2\lg\lg s)+\lg t
=\lg(st)+2\lg\lg s≤\lg(st)+2\lg\lg\lg(st)$; let $L=\lg(st)$, and we
are done. The theorem can be strengthened to show, for any~$n$, infinitely
many prefixes of length~$L$ with complexity $L+O(\lg↑nL)$. We should not, therefore, have
any standard for randomness of finite strings stronger than
$C(t)>\lg(t)$; otherwise the prefixes of
any random string would fail the standard infinitely often.
\end